Topic Detection through Statistical Methods

Authors

  • Frederick George
  • Frederick George Walls
  • Carrie Daake
Abstract

A system is developed to group news stories together according to topic. Several clustering algorithms can be used to group related stories into clusters. The clustering algorithms used require two types of metrics: metrics that, given a story and a set of clusters, can find the most topical cluster for that story; or metrics that can help decide whether or not a given story is on the same topic as a cluster. These metrics are derived by combining simple similarity metrics that compare stories and groups of stories. Finally, methods are proposed for evaluating the story groupings, and experimental results are reported based on these methods.

Thesis Supervisor: Victor Zue
Title: Associate Director of LCS and Senior Research Scientist
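The two kinds of metric described in the abstract can be illustrated with a minimal single-pass clustering sketch. This is not the thesis's actual method: the cosine similarity over raw word counts, the summed-count cluster representation, the threshold value of 0.3, and the names `cosine` and `cluster_stories` are all assumptions chosen for illustration. The selection step picks the most similar existing cluster for a story; the decision step compares that similarity with a threshold to decide whether the story is on the same topic or should start a new cluster.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors (assumed metric)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_stories(stories, threshold=0.3):
    """Single-pass clustering; returns clusters as lists of story indices.

    The threshold is a hypothetical value, not one taken from the source.
    """
    clusters = []   # story indices in each cluster
    centroids = []  # summed term counts for each cluster
    for i, text in enumerate(stories):
        vec = Counter(text.lower().split())

        # Selection metric: find the most similar existing cluster.
        best, best_sim = None, 0.0
        for k, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = k, sim

        # Decision metric: is the story on the same topic as that cluster?
        if best is not None and best_sim >= threshold:
            clusters[best].append(i)
            centroids[best].update(vec)  # fold the story into the cluster
        else:
            clusters.append([i])         # otherwise start a new cluster
            centroids.append(vec)
    return clusters


if __name__ == "__main__":
    stories = [
        "election vote results election",
        "vote election candidates",
        "earthquake damage rescue",
        "rescue earthquake teams",
    ]
    print(cluster_stories(stories))
```

Raising the threshold makes the decision step stricter and yields more, smaller clusters; lowering it merges stories more readily. A real system would likely weight terms (for example by inverse document frequency) rather than use raw counts.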

Similar resources

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Semantic Frame-based Statistical Approach for Topic Detection

We propose a statistical frame-based approach (FBA) for natural language processing, and demonstrate its advantage over traditional machine learning methods by using topic detection as a case study. FBA perceives and identifies semantic knowledge in a more general manner by collecting important linguistic patterns within documents through a unique flexible matching scheme that allows word inser...

Robust Statistical Methods for Automated Outlier Detection

The computational challenge of automating outlier, or blunder point, detection in radio metric data requires the use of nonstandard statistical methods because the outliers have a deleterious effect upon standard least squares methods. The particular nonstandard methods most applicable to the task are the robust statistical techniques that have undergone intense development since the 1960s. The...

Monitoring of Social Network and Change Detection by Applying Statistical Process: ERGM

The statistical modeling of social network data needs much effort because of the complex dependence structure of the tie variables. In order to formulate such dependences, the statistical exponential families of distributions can provide a flexible structure. In this regard, the statistical characteristics of the network is provided to be encapsulated within an Exponential Random Graph Model (...

Community Detection Based on Topic Distance in Social Tagging Networks

Research on the community detection in social tagging networks has attracted much attention in the last decade. Extracting the hidden topic information from tags provides a new way of thinking for community detection in social tagging networks. In this paper, a topic tagging network by extracting several topics from the tags through using the Latent Dirichlet Allocation (LDA) model is built fir...


Publication date: 2013